Pattern Recognition of Strings Containing Traditional and Generalized Transposition Errors

Authors

  • B. J. Oommen
  • R. K. S. Loke
Abstract

We study the problem of recognizing a string Y which is the noisy version of some unknown string X* chosen from a finite dictionary, H. The traditional case, which has been extensively studied in the literature, is the one in which Y contains substitution, insertion and deletion (SID) errors. Although some work has been done to extend the traditional set of edit operations to include the straightforward transposition of adjacent characters [LW75], the problem is unsolved when the transposed characters are themselves subsequently substituted, as is typical in cursive and typewritten script, in molecular biology and in noisy chain-coded boundaries. In this paper we present the first reported solution to the analytic problem of editing one string, X, to another, Y, using these four edit operations. A scheme for obtaining the optimal edit operations is also given. Both these solutions are optimal for the infinite alphabet case. Using these algorithms we present a syntactic pattern recognition scheme which corrects noisy text containing all these types of errors. The paper includes experimental results involving subdictionaries of the most common English words which demonstrate the superiority of our system over existing methods.
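The four edit operations named in the abstract can be illustrated with a small dynamic program. The following is only a sketch under stated assumptions, not the authors' algorithm: the function names `gt_distance` and `recognize`, the cost parameters `sub`, `indel` and `trans`, and the choice to charge a generalized transposition as one swap plus a substitution for each swapped character that still mismatches are all hypothetical choices made for illustration.

```python
def gt_distance(x, y, sub=1.0, indel=1.0, trans=1.0):
    """Cost of editing x into y with substitution, insertion, deletion
    and a generalized transposition (swap two adjacent characters of x,
    then substitute either of them if it still mismatches).

    The cost model is an assumption made for this sketch.
    """
    n, m = len(x), len(y)
    # d[i][j] holds the cheapest cost of editing x[:i] into y[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel          # delete every character of x[:i]
    for j in range(1, m + 1):
        d[0][j] = j * indel          # insert every character of y[:j]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(
                d[i - 1][j - 1] + (0.0 if x[i - 1] == y[j - 1] else sub),
                d[i - 1][j] + indel,     # delete x[i-1]
                d[i][j - 1] + indel,     # insert y[j-1]
            )
            if i >= 2 and j >= 2:
                # Swap x[i-2] and x[i-1], then pay for any residual mismatch.
                fix = (0.0 if x[i - 1] == y[j - 2] else sub) \
                    + (0.0 if x[i - 2] == y[j - 1] else sub)
                best = min(best, d[i - 2][j - 2] + trans + fix)
            d[i][j] = best
    return d[n][m]


def recognize(y, dictionary, **costs):
    """Return the dictionary word with the smallest edit cost to y."""
    return min(dictionary, key=lambda word: gt_distance(word, y, **costs))


if __name__ == "__main__":
    print(gt_distance("ab", "ba"))                    # 1.0 (one swap)
    print(gt_distance("ab", "ca", trans=0.5))         # 1.5 (swap, then b -> c)
    print(recognize("teh", ["then", "they", "the"]))  # the
```

With unit costs, a swap followed by a substitution costs the same as two plain substitutions, so the generalized transposition only changes the outcome when `trans` is set below `sub`; how the costs should actually be assigned is left open here.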

Similar resources

Pattern Recognition of Strings with Substitutions, Insertions, Deletions and Generalized Transpositions

We study the problem of recognizing a string Y which is the noisy version of some unknown string X* chosen from a finite dictionary, H. The traditional case which has been extensively studied in the literature is the one in which Y contains substitution, insertion and deletion (SID) errors. Although some work has been done to extend the traditional set of edit operations to include the straight...

Noisy Subsequence Recognition Using Constrained String Editing Involving Substitutions, Insertions, Deletions and Generalized Transpositions

We consider a problem which can greatly enhance the areas of cursive script recognition and the recognition of printed character sequences. This problem involves recognizing words/strings by processing their noisy subsequences. Let X* be any unknown word from a finite dictionary H. Let U be any arbitrary subsequence of X*. We study the problem of estimating X* by processing Y, a noisy version o...

Noisy Subsequence Recognition Using Constrained String Editing Involving Arbitrary Operations

We consider a problem which can greatly enhance the areas of cursive script recognition and the recognition of printed character sequences. This problem involves recognizing words/strings by processing their noisy subsequences. Let X* be any unknown word from a finite dictionary H. Let U be any arbitrary subsequence of X*. We study the problem of estimating X* by processing Y, a noisy version o...

Segmentation and recognition of connected handwritten numeral strings

A new segmentation method for segmenting connected handwritten digit strings is presented. Unlike traditional methods where segmentation points are uniquely determined to cut the piece of stroke joining the connected numerals, our approach is one of identifying regions which serve as potential segmentation points. The regions are identified by a thorough analysis of the trajectory of strokes. ...

Hereditarily Homogeneous Generalized Topological Spaces

In this paper we study hereditarily homogeneous generalized topological spaces. Various properties of hereditarily homogeneous generalized topological spaces are discussed. We prove that a generalized topological space is hereditarily homogeneous if and only if every transposition of $X$ is a $\mu$-homeomorphism on $X$.


Publication date: 2000